The core numerical algorithms that need to scale are the FFT and dense linear algebra. The FFT scales to at most 512 processors per k-point. Fortunately, there are many k-points, and each can be computed independently.
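To make the k-point idea concrete, here is a minimal sketch of how processors could be split into independent groups, one group per k-point at a time, with each group capped at the 512-processor FFT limit. The function name `plan_kpoint_groups`, its parameters, and the round-robin assignment are my own assumptions for illustration, not the scheme the actual code uses.

```python
def plan_kpoint_groups(n_procs, n_kpoints, max_per_kpoint=512):
    """Split processor ranks into independent groups, one per concurrent k-point.

    Hypothetical helper: the 512 cap reflects the FFT scaling limit
    mentioned in the text; everything else is an illustrative assumption.
    Returns a list of (ranks, kpoints) pairs.
    """
    if n_procs < 1 or n_kpoints < 1:
        raise ValueError("need at least one processor and one k-point")

    # Give each k-point an equal share of processors, but never more
    # than the FFT can use and never fewer than one.
    group_size = min(max_per_kpoint, max(1, n_procs // n_kpoints))

    # No point in having more groups than k-points.
    n_groups = min(n_kpoints, n_procs // group_size)

    groups = []
    for g in range(n_groups):
        ranks = range(g * group_size, (g + 1) * group_size)
        # Round-robin: group g handles k-points g, g + n_groups, ...
        kpoints = list(range(g, n_kpoints, n_groups))
        groups.append((ranks, kpoints))
    return groups


if __name__ == "__main__":
    for ranks, kpoints in plan_kpoint_groups(4096, 8):
        print(f"ranks {ranks.start}-{ranks.stop - 1}: k-points {kpoints}")
```

With 4096 processors and 8 k-points, each k-point gets its own group of 512 ranks. Beyond that, extra processors sit idle in this sketch, which is why the per-k-point FFT limit matters: more k-points are the only way to use more processors.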
Modifying the assignment of tasks to processors increased performance by 64%! FLOPS are free; it's the communication patterns that determine performance.
One research problem is improving the scalability of computations for small systems (e.g., 32 water molecules), so they can be simulated for longer times.